THE EDIT DISTANCE FUNCTION AND SYMMETRIZATION



RYAN MARTIN 



Abstract. The edit distance between two graphs on the same labeled vertex set is the size of the symmetric difference of the edge sets. The distance between a graph, $G$, and a hereditary property, $\mathcal{H}$, is the minimum of the distance between $G$ and each $G' \in \mathcal{H}$. The edit distance function of $\mathcal{H}$ is a function of $p \in [0,1]$ and is the limit of the maximum normalized distance between a graph of density $p$ and $\mathcal{H}$.

This paper utilizes a method due to Sidorenko [Combinatorica 13(1), pp. 109-120], called "symmetrization", for computing the edit distance function of various hereditary properties. For any graph $H$, $\mathrm{Forb}(H)$ denotes the property of not having an induced copy of $H$. This paper gives some results regarding estimation of the function for an arbitrary hereditary property. This paper also gives the edit distance function for $\mathrm{Forb}(H)$, where $H$ is a cycle on 9 or fewer vertices.

1. Introduction

The study of the edit distance in graphs was initiated independently by Axenovich, Kézdy and the author [6], Alon and Stav [2] and, in a different formulation, by Richter [18]. Since then, there has been a great deal of study on the edit distance itself and on the so-called edit distance function.

1.1. The edit distance function. The edit distance between graphs $G$ and $G'$ on the same labeled vertex set is $|E(G) \triangle E(G')|$ and is denoted $\mathrm{dist}(G,G')$. The distance between a graph $G$ and a property $\mathcal{H}$ is

$$\mathrm{dist}(G,\mathcal{H}) := \min\{\mathrm{dist}(G,G') : V(G) = V(G'),\ G' \in \mathcal{H}\}.$$

The edit distance function of a property $\mathcal{H}$, denoted $\mathrm{ed}_{\mathcal{H}}(p)$, measures the maximum distance of a density-$p$ graph from $\mathcal{H}$. Formally,

(1)    $$\mathrm{ed}_{\mathcal{H}}(p) = \lim_{n\to\infty} \max\left\{\mathrm{dist}(G,\mathcal{H}) : |V(G)| = n,\ |E(G)| = \left\lfloor p\binom{n}{2}\right\rfloor\right\} \Big/ \binom{n}{2}$$

if this limit exists.
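
As a small illustration of the definition of $\mathrm{dist}(G,G')$, the following Python sketch computes the edit distance of two graphs given as edge lists on the same labeled vertex set, together with the normalization by $\binom{n}{2}$ used in (1). The function names are illustrative and do not come from the paper.

```python
def edit_distance(edges_g, edges_h):
    """Size of the symmetric difference of two edge sets on the same labeled vertex set."""
    as_set = lambda edges: {frozenset(e) for e in edges}  # undirected: (u, v) == (v, u)
    return len(as_set(edges_g) ^ as_set(edges_h))


def normalized_distance(edges_g, edges_h, n):
    """Edit distance divided by binom(n, 2), the normalization used in (1)."""
    return edit_distance(edges_g, edges_h) / (n * (n - 1) // 2)


# The path 0-1-2-3 and the 4-cycle 0-1-2-3-0 differ only in the pair {3, 0}.
path = [(0, 1), (1, 2), (2, 3)]
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(edit_distance(path, cycle), normalized_distance(path, cycle, 4))
```

Computing $\mathrm{dist}(G,\mathcal{H})$ itself requires minimizing over all members of $\mathcal{H}$ on the same vertex set, which this sketch does not attempt.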

A hereditary property is a family of graphs that is closed under the taking of induced subgraphs. It is natural to study the edit distance of graphs from hereditary properties because if $H$ is an induced subgraph of $G$ and $H'$ is an induced subgraph of $G'$, then $\mathrm{dist}(H,H') \le \mathrm{dist}(G,G')$.

A hereditary property $\mathcal{H}$ is trivial if there is an $n_0$ such that $\mathcal{H}$ has no $n_0$-vertex graph (hence, no $n$-vertex graph for $n \ge n_0$). Otherwise, it is nontrivial. If $\mathcal{H}$ is a nontrivial hereditary property, then it has an $n$-vertex graph for all natural numbers $n$. Throughout this paper, all graph properties will be nontrivial hereditary properties.

In [8], a result of Alon and Stav [2] is generalized to show that the limit in (1) does indeed exist for nontrivial hereditary properties and, furthermore, that it is the limit of the expectation of the normalized edit distance for random graphs with the appropriate edge-probability:

$$\mathrm{ed}_{\mathcal{H}}(p) = \lim_{n\to\infty} \mathbb{E}\left[\mathrm{dist}(G(n,p),\mathcal{H})\right] \Big/ \binom{n}{2}.$$

It is explicitly shown in [8] that, for any nontrivial hereditary property $\mathcal{H}$, the function $\mathrm{ed}_{\mathcal{H}}(p)$ is continuous and concave down. Hence, it achieves its maximum at a point we define to be $(p^*_{\mathcal{H}}, d^*_{\mathcal{H}})$. It should be noted that, for some hereditary properties, $p^*_{\mathcal{H}}$ might be an interval.

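For every hereditary property $\mathcal{H}$, there is a family of graphs that are minimal with respect to taking induced subgraphs, which we call forbidden graphs. We denote $\mathcal{F}(\mathcal{H})$ to be the minimal (with respect to vertex-deletion) set of graphs $H$ for which

$$\mathcal{H} = \bigcap_{H \in \mathcal{F}(\mathcal{H})} \mathrm{Forb}(H).$$

If $\mathcal{H} = \bigcap_{H \in \mathcal{F}(\mathcal{H})} \mathrm{Forb}(H)$, then we denote $\overline{\mathcal{H}}$ to be the hereditary property $\overline{\mathcal{H}} = \bigcap_{H \in \mathcal{F}(\mathcal{H})} \mathrm{Forb}(\overline{H})$. I.e., $H \in \mathcal{F}(\overline{\mathcal{H}})$ if and only if $\overline{H} \in \mathcal{F}(\mathcal{H})$. Note that $\overline{\mathcal{H}}$ does not denote the complement of $\mathcal{H}$ as a set.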

For background on the edit distance function, applications thereof and theoretical background, we direct the reader to Balogh and the author [8], Alon and Stav [2, 3, 4, 5], Axenovich, Kézdy and the author [6], and Axenovich and the author [7]. The theoretical background upon which this is based can be traced to papers by Prömel and Steger [15, 16, 17], Bollobás and Thomason [9, 10] and Alekseev [1], among others.

1.2. Main results. The main results of this paper are Theorem 1 and Theorem 2, but we also develop a general theory and specific techniques which enable one to compute the edit distance function.

In Theorem 1 we provide bounds on the edit distance function for hereditary properties that forbid a clique. We later cite the fact that $\mathrm{ed}_{\mathcal{H}}(p) = \mathrm{ed}_{\overline{\mathcal{H}}}(1-p)$ (Theorem 10(v)), so Theorem 1 can also be applied to hereditary properties that forbid an independent set.

Theorem 1. Let $\mathcal{H}$ be a nontrivial hereditary property such that $\mathcal{F}(\mathcal{H})$ contains a complete graph and let $h$ be the minimum positive integer such that $\mathcal{H} \subseteq \mathrm{Forb}(K_h)$. Let $\chi$ be the chromatic number of $\mathcal{H}$ and $m$ be the smallest positive integer such that $\mathcal{F}(\mathcal{H})$ contains a complete multipartite graph with $m$ parts. Clearly, $\chi \le m \le h$.

$$\min\left\{\frac{p}{\chi-1},\ \frac{1-p}{\chi-1} + \frac{2p-1}{m-1}\right\} \le \mathrm{ed}_{\mathcal{H}}(p) \le \min\left\{\frac{p}{\chi-1},\ 1-p+\frac{2p-1}{m-1}\right\}.$$

In particular,

$$\mathrm{ed}_{\mathrm{Forb}(K_h)}(p) = \frac{p}{h-1}.$$

In Theorem 2, equation (2) is a trivial result and equation (3) was proven by Marchant and Thomason [13]. Some related results for $C_4$ were obtained by Alon and Stav [3]. Thomason [20] reports that Marchant [12] has proven equations (4) and (6). We note that the problem considered in [13] and in [12] is not edit distance but can be shown to be equivalent.



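Theorem 2. Let $C_h$ denote the cycle on $h$ vertices.

(2)    $\mathrm{ed}_{\mathrm{Forb}(C_3)}(p) = \dfrac{p}{2}$

(3)    $\mathrm{ed}_{\mathrm{Forb}(C_4)}(p) = p(1-p)$

(4)    $\mathrm{ed}_{\mathrm{Forb}(C_5)}(p) = \min\left\{\dfrac{p}{2},\ \dfrac{1-p}{2}\right\}$

(5)    $\mathrm{ed}_{\mathrm{Forb}(C_6)}(p) = \min\left\{p(1-p),\ \dfrac{1-p}{2}\right\}$

(6)    $\mathrm{ed}_{\mathrm{Forb}(C_7)}(p) = \min\left\{\dfrac{p}{2},\ \dfrac{p(1-p)}{1+p},\ \dfrac{1-p}{3}\right\}$

(7)    $\mathrm{ed}_{\mathrm{Forb}(C_8)}(p) = \min\left\{\dfrac{p(1-p)}{1+p},\ \dfrac{1-p}{3}\right\}$

(8)    $\mathrm{ed}_{\mathrm{Forb}(C_9)}(p) = \min\left\{\dfrac{p}{2},\ \dfrac{1-p}{4}\right\}$

(9)    $\mathrm{ed}_{\mathrm{Forb}(C_{10})}(p) = \min\left\{\dfrac{p(1-p)}{1+2p},\ \dfrac{1-p}{4}\right\}$, for $p \in [1/7, 1]$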


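Corollary 3. Let $C_h$ denote the cycle on $h$ vertices. Then,

$$\left(p^*_{\mathrm{Forb}(C_h)},\ d^*_{\mathrm{Forb}(C_h)}\right) = \begin{cases}
(1,\ 1/2), & \text{if } h = 3; \\
(1/2,\ 1/4), & \text{if } h = 4; \\
(1/2,\ 1/4), & \text{if } h = 5; \\
(1/2,\ 1/4), & \text{if } h = 6; \\
(\sqrt{2}-1,\ 3-2\sqrt{2}), & \text{if } h = 7; \\
(\sqrt{2}-1,\ 3-2\sqrt{2}), & \text{if } h = 8; \\
(1/3,\ 1/6), & \text{if } h = 9; \\
((\sqrt{3}-1)/2,\ (2-\sqrt{3})/2), & \text{if } h = 10.
\end{cases}$$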
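
The peak values in Corollary 3 can be sanity-checked numerically from the closed forms in Theorem 2. The sketch below maximizes each formula over an evenly spaced grid; this is only a numerical check under the assumption that the formulas are as stated above (with $h = 10$ restricted to $p \in [1/7,1]$), and the names `ED`, `PEAK` and `grid_peak` are illustrative.

```python
import math

# Closed forms from Theorem 2; the h = 10 formula is stated only for p in [1/7, 1].
ED = {
    3: lambda p: p / 2,
    4: lambda p: p * (1 - p),
    5: lambda p: min(p / 2, (1 - p) / 2),
    6: lambda p: min(p * (1 - p), (1 - p) / 2),
    7: lambda p: min(p / 2, p * (1 - p) / (1 + p), (1 - p) / 3),
    8: lambda p: min(p * (1 - p) / (1 + p), (1 - p) / 3),
    9: lambda p: min(p / 2, (1 - p) / 4),
    10: lambda p: min(p * (1 - p) / (1 + 2 * p), (1 - p) / 4),
}

SQ2, SQ3 = math.sqrt(2), math.sqrt(3)

# (p*, d*) as listed in Corollary 3.
PEAK = {
    3: (1.0, 0.5), 4: (0.5, 0.25), 5: (0.5, 0.25), 6: (0.5, 0.25),
    7: (SQ2 - 1, 3 - 2 * SQ2), 8: (SQ2 - 1, 3 - 2 * SQ2),
    9: (1 / 3, 1 / 6), 10: ((SQ3 - 1) / 2, (2 - SQ3) / 2),
}


def grid_peak(h, steps=20000):
    """Maximize ED[h] over an evenly spaced grid (a numerical check, not a proof)."""
    lo = 1 / 7 if h == 10 else 0.0
    grid = (lo + (1 - lo) * i / steps for i in range(steps + 1))
    p_best = max(grid, key=ED[h])
    return p_best, ED[h](p_best)


for h in sorted(ED):
    p_num, d_num = grid_peak(h)
    print(h, round(p_num, 4), round(d_num, 4))
```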



The rest of the paper is organized as follows: Section 2 gives some of the general definitions for the edit distance function, such as colored regularity graphs. Section 3 gives some theorems with which the edit distance function can be estimated. Section 4 contains the proof of Theorem 1. Section 5 defines and categorizes so-called $p$-core colored regularity graphs introduced by Marchant and Thomason [13]. Section 6 discusses the symmetrization method in general. Section 7 proves Theorem 2 regarding cycles. Section 8 gives some concluding remarks, a conjecture and acknowledgements.

2. Development of the proofs 

2.1. Notation. All graphs are simple. If $S$ and $T$ are sets, then $S + T$ denotes the disjoint union of $S$ and $T$. If $G_1$ and $G_2$ are graphs, then $G_1 + G_2$ denotes the disjoint union of the graphs and $G_1 \vee G_2$ denotes the join. If $v$ and $w$ are adjacent vertices in a graph, we denote the edge between them to be $vw$.



2.2. Colored regularity graphs. A colored regularity graph (CRG), $K$, is a simple complete graph, together with a partition of the vertices into black and white, $V(K) = VW(K) + VB(K)$, and a partition of the edges into black, white and gray, $E(K) = EW(K) + EG(K) + EB(K)$. We say that a graph $H$ embeds in $K$ (writing $H \mapsto K$) if there is a function $\varphi : V(H) \to V(K)$ so that if $h_1h_2 \in E(H)$, then either $\varphi(h_1) = \varphi(h_2) \in VB(K)$ or $\varphi(h_1)\varphi(h_2) \in EB(K) \cup EG(K)$ and if $h_1h_2 \notin E(H)$, then either $\varphi(h_1) = \varphi(h_2) \in VW(K)$ or $\varphi(h_1)\varphi(h_2) \in EW(K) \cup EG(K)$.

For a hereditary property of graphs, we denote $\mathcal{K}(\mathcal{H})$ to be the subset of CRGs such that no forbidden graph maps into $K$. That is, $\mathcal{K}(\mathcal{H}) = \{K : H \not\mapsto K,\ \forall H \in \mathcal{F}(\mathcal{H})\}$.
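
The embedding condition can be tested by brute force for very small graphs. The sketch below is an illustration only: it tries every map $\varphi$, so it is exponential in $|V(H)|$, and the encoding of a CRG as a list of vertex colors plus a dictionary of edge colors is an assumption made here, not notation from the paper.

```python
from itertools import product


def embeds(n, edges, vertex_color, edge_color):
    """Brute-force test of H -> K.

    H has vertices 0..n-1 and edge list `edges`.  K is given by
    vertex_color[v] in {"W", "B"} and edge_color[frozenset({u, v})] in {"W", "G", "B"}.
    """
    h_edges = {frozenset(e) for e in edges}
    k = len(vertex_color)
    for phi in product(range(k), repeat=n):
        ok = True
        for a in range(n):
            for b in range(a + 1, n):
                u, v = phi[a], phi[b]
                color = vertex_color[u] if u == v else edge_color[frozenset((u, v))]
                is_edge = frozenset((a, b)) in h_edges
                # Gray accepts anything; black needs an edge of H; white needs a non-edge.
                if color != "G" and (color == "B") != is_edge:
                    ok = False
                    break
            if not ok:
                break
        if ok:
            return True
    return False


def gray_crg(r, s):
    """The CRG with r white vertices, s black vertices and all edges gray."""
    colors = ["W"] * r + ["B"] * s
    edge_colors = {frozenset((i, j)): "G" for i in range(r + s) for j in range(i + 1, r + s)}
    return colors, edge_colors


C3 = [(0, 1), (1, 2), (0, 2)]
C4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(embeds(4, C4, *gray_crg(2, 0)), embeds(3, C3, *gray_crg(2, 0)))
```

For instance, $C_4$ is bipartite, so it embeds in two white vertices joined by a gray edge, whereas $C_3$ does not.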

In a CRG, $K$, vertex $v$ is twin to vertex $w$ if their neighborhoods are the same. That is, they are twin if (a) $v$ and $w$ and $vw$ have the same color and (b) whenever $x \in V(K) - \{v,w\}$, the edges $vx$ and $wx$ are the same color.

We say that a CRG, $K'$, is formed by the partition of a vertex $v$ if $V(K') = V(K) \cup \{v'\}$ where, for every $x \in V(K)$, the edge $v'x$ has the same color in $K'$ as $vx$ has in $K$. All other edges in $K'$ inherit the same color as in $K$. We say that $K''$ is formed by the fusion of equivalent vertices $v$ and $v'$ by letting $V(K'') = (V(K) - \{v,v'\}) \cup \{v''\}$ where, for every $x \in V(K)$, the edge $v''x$ has the same color as both $vx$ and $v'x$.

Two CRGs, $K$ and $K'$, are said to be equivalent if $K'$ can be constructed from $K$ by the partition of vertices or fusion of twin vertices. A CRG is reduced if it has no pair of equivalent vertices. A CRG, $K'$, is an equipartition of a CRG, $K$, if there is an integer $\ell$ such that each vertex in $K$ is partitioned into exactly $\ell$ vertices.

A CRG $K'$ is said to be a sub-CRG of $K$ if $K'$ can be obtained by deleting vertices of $K$.



2.3. The $f$ and $g$ functions. For every hereditary property, $\mathcal{H}$, the function $\mathrm{ed}_{\mathcal{H}}(p)$ in (1) measures not only the maximum normalized edit distance among density-$p$ graphs but also the expectation of the normalized distance from $G(n,p)$. That is, Alon and Stav [2] prove that

$$\mathrm{ed}_{\mathcal{H}}(p) = \lim_{n\to\infty} \mathbb{E}\left[\mathrm{dist}(G(n,p),\mathcal{H})\right] \Big/ \binom{n}{2}.$$

The normalized distance of $G(n,p)$ from a hereditary property is well-defined because the distance from $G(n,p)$ to $\mathcal{H}$ is concentrated around its mean.

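For every CRG, $K$, we associate two functions of $p \in [0,1]$. The function $f$ is linear in $p$ and $g$ is found by the solution of a quadratic program. Let $K$ have a total of $k$ vertices $\{v_1,\ldots,v_k\}$, and let $M_K(p)$ be a matrix such that the entries are (where, for $i = j$, $v_iv_i$ is read as the vertex $v_i$):

$$[M_K(p)]_{ij} = \begin{cases}
p, & \text{if } v_iv_j \in VW(K) \cup EW(K); \\
1-p, & \text{if } v_iv_j \in VB(K) \cup EB(K); \\
0, & \text{if } v_iv_j \in EG(K).
\end{cases}$$

Then, we can express the $f$ and $g$ functions over the domain $p \in [0,1]$ as follows, with $VW = VW(K)$, $VB = VB(K)$, $EW = EW(K)$ and $EB = EB(K)$:

(10)    $$f_K(p) = \frac{1}{k^2}\left[p\left(|VW| + 2|EW|\right) + (1-p)\left(|VB| + 2|EB|\right)\right]$$

(11)    $$g_K(p) = \begin{cases} \min & \mathbf{x}^T M_K(p)\, \mathbf{x} \\ \text{s.t.} & \mathbf{x}^T \mathbf{1} = 1 \\ & \mathbf{x} \ge \mathbf{0}. \end{cases}$$

If we denote $\mathbf{1}$ to be the vector of all ones, then $f_K(p) = \left(\frac{1}{k}\mathbf{1}\right)^T M_K(p) \left(\frac{1}{k}\mathbf{1}\right)$. So, $f_K(p) \ge g_K(p)$.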
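
To make the relationship between (10) and the quadratic form concrete, the following sketch builds $M_K(p)$ for a small CRG and evaluates $f_K(p)$ both by formula (10) and as the quadratic form at the uniform vector. The CRG encoding (a list of vertex colors and a dictionary of edge colors) and the function names are assumptions made for illustration.

```python
from fractions import Fraction as F


def m_matrix(vertex_color, edge_color, p):
    """M_K(p): p on white vertices/edges, 1 - p on black ones, 0 on gray edges."""
    w = {"W": p, "B": 1 - p, "G": 0}
    k = len(vertex_color)
    return [[w[vertex_color[i]] if i == j else w[edge_color[frozenset((i, j))]]
             for j in range(k)] for i in range(k)]


def quad_form(M, x):
    """x^T M x."""
    k = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(k) for j in range(k))


def f_K(vertex_color, edge_color, p):
    """Formula (10)."""
    k = len(vertex_color)
    vw, vb = vertex_color.count("W"), vertex_color.count("B")
    ew = sum(1 for c in edge_color.values() if c == "W")
    eb = sum(1 for c in edge_color.values() if c == "B")
    return (p * (vw + 2 * ew) + (1 - p) * (vb + 2 * eb)) / k ** 2


# A two-vertex example: one white vertex, one black vertex, joined by a white edge.
colors = ["W", "B"]
edge_colors = {frozenset((0, 1)): "W"}
p = F(1, 3)
M = m_matrix(colors, edge_colors, p)
uniform = [F(1, 2), F(1, 2)]
print(f_K(colors, edge_colors, p), quad_form(M, uniform), quad_form(M, [F(1), F(0)]))
```

In this example, putting all weight on the white vertex gives a smaller value than the uniform vector, which illustrates why $g_K(p)$ can be strictly less than $f_K(p)$.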

Fact 4. The function $g$ is invariant under equivalence classes of CRGs. That is, if $K$ and $K'$ are equivalent CRGs, then $g_K(p) = g_{K'}(p)$ for all $p \in [0,1]$.



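We can use both the $f$ and $g$ functions of CRGs to compute the edit distance function.

Theorem 5 ([8]). For any nontrivial hereditary property $\mathcal{H}$,

$$\mathrm{ed}_{\mathcal{H}}(p) = \inf_{K \in \mathcal{K}(\mathcal{H})} g_K(p) = \inf_{K \in \mathcal{K}(\mathcal{H})} f_K(p).$$

Remark 6. Marchant and Thomason [13] prove that, in fact, $\mathrm{ed}_{\mathcal{H}}(p) = \min_{K \in \mathcal{K}(\mathcal{H})} g_K(p)$. That is, for every $p \in [0,1]$, there is a CRG, $K \in \mathcal{K}(\mathcal{H})$, such that $\mathrm{ed}_{\mathcal{H}}(p) = g_K(p)$.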



A sub-CRG, $K'$, of a CRG, $K$, is a component if, for all $v \in V(K')$ and all $w \in V(K) - V(K')$, the edge $vw$ is gray. Theorem 7 allows the computation of $g_K$ from the $g$ functions of its components.

Theorem 7. Let $K$ be a CRG with components $K^{(1)},\ldots,K^{(\ell)}$. Then

$$\left(g_K(p)\right)^{-1} = \sum_{i=1}^{\ell} \left(g_{K^{(i)}}(p)\right)^{-1}.$$

Proof. The matrix $M_K(p)$ is a block-diagonal matrix. Let $g_i = g_{K^{(i)}}(p)$ for $i = 1,\ldots,\ell$ and $g = g_K(p)$. We may first assign the total weights of the vertices in each component. Then, the relative weights of the vertices in each component are defined by that component's $g$ function.

Let $a_i$ denote the total weight that the optimal solution of (11) assigns to the vertices of $K^{(i)}$. Then, we obtain the following optimization problem:

$$\begin{array}{ll} \min & a_1^2 g_1 + \cdots + a_\ell^2 g_\ell \\ \text{s.t.} & a_1 + \cdots + a_\ell = 1 \\ & a_1,\ldots,a_\ell \ge 0. \end{array}$$

Using the method of Lagrange multipliers, we see that the solution is $a_i = g/g_i$ for $i = 1,\ldots,\ell$ and $g^{-1} = \sum_{i=1}^{\ell} g_i^{-1}$. Substituting these values gives the theorem statement. □

Theorem 7 can be applied directly to CRGs that have only gray edges. Since the $g$ function for a white vertex is $p$ and the $g$ function for a black vertex is $1-p$, we have Corollary 8.

Corollary 8. If $K$ is a CRG all of whose edges are gray, then

$$g_K(p) = \left(\frac{|VW(K)|}{p} + \frac{|VB(K)|}{1-p}\right)^{-1}.$$
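
Theorem 7 and Corollary 8 translate directly into a short computation. The sketch below, with illustrative function names, combines component values harmonically and specializes to the all-gray CRG with $r$ white and $s$ black vertices; it assumes $p \in (0,1)$ and $r + s \ge 1$ so that no division by zero occurs.

```python
from fractions import Fraction as F


def g_from_components(component_values):
    """Theorem 7: 1 / g_K is the sum of 1 / g over the components of K."""
    return 1 / sum(1 / g for g in component_values)


def g_gray(r, s, p):
    """Corollary 8 for the CRG with r white vertices, s black vertices, all edges gray."""
    # Each white vertex is a component with g = p; each black vertex has g = 1 - p.
    return g_from_components([p] * r + [1 - p] * s)


p = F(1, 3)
print(g_gray(1, 1, p), g_gray(2, 0, p), g_gray(0, 4, p))
```

Exact rational arithmetic is used so that the result can be compared with the closed form $p(1-p)/(r(1-p)+sp)$ without rounding.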



Proposition 9 gives the edit distance function for some special CRGs that have no gray edges.

Proposition 9. Let $K$ be a CRG on $k$ vertices and no gray edges as follows:

• If all vertices are white and all edges are black, then $g_K(p) = \min\{p,\ 1-p+(2p-1)/k\}$.

• If all vertices are black and all edges are white, then $g_K(p) = \min\{p+(1-2p)/k,\ 1-p\}$.
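
Proposition 9 can be checked on small cases by evaluating the quadratic form in (11) on a grid of the simplex. In general a grid search only gives an upper bound on $g_K(p)$; for these two CRGs the form equals $\text{off} + (\text{diag} - \text{off})\sum_i x_i^2$, so the minimum is attained at a vertex of the simplex or at the uniform vector, both of which lie on the grid when the denominator is divisible by $k$. The helper names are illustrative.

```python
from fractions import Fraction as F


def compositions(total, parts):
    """Yield every tuple of `parts` nonnegative integers that sums to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def grid_min(diag, off, k, denom=12):
    """Minimum of x^T M x over grid points x = c / denom of the simplex, where M has
    `diag` on the diagonal and `off` everywhere else."""
    best = None
    for c in compositions(denom, k):
        sq = sum(F(ci, denom) ** 2 for ci in c)
        val = diag * sq + off * (1 - sq)  # uses (sum of x_i)^2 = 1
        best = val if best is None else min(best, val)
    return best


def prop9_white_vertices(p, k):
    """All vertices white, all edges black."""
    return min(p, 1 - p + (2 * p - 1) / k)


def prop9_black_vertices(p, k):
    """All vertices black, all edges white."""
    return min(p + (1 - 2 * p) / k, 1 - p)


p = F(3, 4)
print(grid_min(p, 1 - p, 3), prop9_white_vertices(p, 3))
```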



3. Estimation of the edit distance function 

Denote $K(r,s)$ to be the CRG with $r$ white vertices, $s$ black vertices and all gray edges. Let $\mathcal{H}$ be a hereditary property with $\mathcal{H} = \bigcap_{H \in \mathcal{F}(\mathcal{H})} \mathrm{Forb}(H)$. The notion of $(r,s)$-colorability is discussed by Alon and Stav [3] where they focus on hereditary properties that are complement-invariant.

The chromatic number of $\mathcal{H}$, denoted $\chi(\mathcal{H})$ or just $\chi$, where the context is clear, is $\min\{\chi(H) : H \in \mathcal{F}(\mathcal{H})\}$. The complementary chromatic number (footnote 1) of $\mathcal{H}$, denoted $\overline{\chi}(\mathcal{H})$ or $\overline{\chi}$, is $\min\{\chi(\overline{H}) : H \in \mathcal{F}(\mathcal{H})\}$. The binary chromatic number, $\chi_B$, is $\max\{k+1 : \exists\, r,s,\ r+s = k,\ H \not\mapsto K(r,s),\ \forall H \in \mathcal{F}(\mathcal{H})\}$.

The clique spectrum of $\mathcal{H}$ is the set

$$\Gamma(\mathcal{H}) = \{(r,s) : H \not\mapsto K(r,s),\ \forall H \in \mathcal{F}(\mathcal{H})\}.$$

The clique spectrum has a number of useful properties. For example, it is monotone in the sense that if $(r,s) \in \Gamma(\mathcal{H})$ and $0 \le r' \le r$ and $0 \le s' \le s$, then $(r',s') \in \Gamma(\mathcal{H})$. As a result, the clique spectrum of a hereditary property can be expressed as a Young tableau. An extreme point of the clique spectrum $\Gamma$ is a pair $(r,s) \in \Gamma$ for which both $(r+1,s) \notin \Gamma$ and $(r,s+1) \notin \Gamma$. Let $\Gamma^*$ denote the extreme points of clique spectrum $\Gamma$. Figure 1 shows the clique spectrum of the cycle $C_9$ expressed as a Young tableau, with the extreme points of the clique spectrum marked.

3.1. Approximating $\mathrm{ed}_{\mathcal{H}}(p)$ by $\gamma_{\mathcal{H}}(p)$. Corollary 8 gives that $g_{K(r,s)}(p) = \frac{p(1-p)}{r(1-p)+sp}$, which follows directly from Theorem 7. Define the function $\gamma_{\mathcal{H}}(p)$ as follows:

$$\gamma_{\mathcal{H}}(p) = \min\left\{g_{K(r,s)}(p) : (r,s) \in \Gamma(\mathcal{H})\right\} = \min\left\{\frac{p(1-p)}{r(1-p)+sp} : (r,s) \in \Gamma(\mathcal{H})\right\}.$$

Footnote 1. Unfortunately, the term "cochromatic number" is taken. It should be noted that the cochromatic number, although its definition resembles that of $\chi_B$, is not the same parameter.

Figure 1. The clique spectrum of $C_9$ expressed as a Young tableau. The extreme points of the clique spectrum, $(0,4)$, $(1,2)$ and $(2,0)$, are labeled.

Clearly, $\mathrm{ed}_{\mathcal{H}}(p) \le \gamma_{\mathcal{H}}(p)$. Moreover, $\gamma_{\mathcal{H}}(p) = \min\left\{g_{K(r,s)}(p) : (r,s) \in \Gamma^*(\mathcal{H})\right\}$; i.e., only $(r,s)$ that are extreme points of the clique spectrum need to be used to compute $\gamma$. The value of the function $\gamma_{\mathcal{H}}(p)$ is that it is computable for any hereditary property.
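
Since $\gamma_{\mathcal{H}}$ depends only on the extreme points of the clique spectrum, it can be evaluated with a few lines of code. The sketch below uses the extreme points $(0,4)$, $(1,2)$, $(2,0)$ shown in Figure 1 for $C_9$; the function names are illustrative and $p$ is assumed to lie strictly between 0 and 1.

```python
from fractions import Fraction as F


def g_gray(r, s, p):
    """g_{K(r,s)}(p) = p(1-p) / (r(1-p) + s p), from Corollary 8."""
    return p * (1 - p) / (r * (1 - p) + s * p)


def gamma(extreme_points, p):
    """gamma_H(p): the minimum of g_{K(r,s)}(p) over the extreme points (r, s)."""
    return min(g_gray(r, s, p) for r, s in extreme_points)


C9_EXTREME_POINTS = [(0, 4), (1, 2), (2, 0)]
for p in [F(1, 5), F(1, 3), F(1, 2)]:
    print(p, gamma(C9_EXTREME_POINTS, p))
```

For $C_9$ the middle extreme point $(1,2)$ never gives the strict minimum, so the result coincides with $\min\{p/2, (1-p)/4\}$ at every sampled $p$.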

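3.2. Basic observations on $\mathrm{ed}_{\mathcal{H}}(p)$. The following is a summary of basic facts about the edit distance function. Item (iii) comes from Alon and Stav [2]. Item (iv) comes from [8]. The remaining items are trivial.

Theorem 10. Let $\mathcal{H}$ be a nontrivial hereditary property with chromatic number $\chi$, complementary chromatic number $\overline{\chi}$, binary chromatic number $\chi_B$ and edit distance function $\mathrm{ed}_{\mathcal{H}}(p)$.

(i) If $\chi > 1$, then $\mathrm{ed}_{\mathcal{H}}(p) \le p/(\chi-1)$.

(ii) If $\overline{\chi} > 1$, then $\mathrm{ed}_{\mathcal{H}}(p) \le (1-p)/(\overline{\chi}-1)$.

(iii) $\mathrm{ed}_{\mathcal{H}}(1/2) = 1/(2(\chi_B-1)) = \gamma_{\mathcal{H}}(1/2)$.

(iv) $\mathrm{ed}_{\mathcal{H}}(p)$ is continuous and concave down.

(v) $\mathrm{ed}_{\mathcal{H}}(p) = \mathrm{ed}_{\overline{\mathcal{H}}}(1-p)$.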

There are a number of immediate corollaries of Theorem 10 that help estimate the edit distance functions. Some of the most useful are summarized in Corollary 11 and we leave the proof of them to the reader.

Corollary 11. Let $\mathcal{H}$ be a nontrivial hereditary property with binary chromatic number $\chi_B$. Let $(r,s)$ be an extreme point in the clique spectrum of $\mathcal{H}$ such that $r+s = \chi_B - 1$.

(i) If $\chi = \chi_B$, then $\mathrm{ed}_{\mathcal{H}}(p) = p/(\chi-1)$ for all $p \in [0,1/2]$.
(ii) If $\overline{\chi} = \chi_B$, then $\mathrm{ed}_{\mathcal{H}}(p) = (1-p)/(\chi_B-1)$ for all $p \in [1/2,1]$.
(iii) If $r \ge s$, then $p^*_{\mathcal{H}} \ge 1/2$.
(iv) If $r \le s$, then $p^*_{\mathcal{H}} \le 1/2$.
(v) For any $(r,s)$ in the clique spectrum, $d^*_{\mathcal{H}} \le (\sqrt{r}+\sqrt{s})^{-2}$.
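
Item (v) rests on the fact that $g_{K(r,s)}(p) = p(1-p)/(r(1-p)+sp)$ is maximized at $p = \sqrt{r}/(\sqrt{r}+\sqrt{s})$ with maximum value $(\sqrt{r}+\sqrt{s})^{-2}$. The sketch below checks this numerically for a few pairs with $r, s \ge 1$; it is a floating-point check rather than a proof, and the function names are illustrative.

```python
import math


def g_gray(r, s, p):
    """g_{K(r,s)}(p) = p(1-p) / (r(1-p) + s p)."""
    return p * (1 - p) / (r * (1 - p) + s * p)


def peak(r, s):
    """Maximizer and maximum of g_{K(r,s)} on (0, 1), assuming r, s >= 1."""
    a, b = math.sqrt(r), math.sqrt(s)
    return a / (a + b), 1 / (a + b) ** 2


p_star, d_star = peak(1, 2)
print(p_star, d_star)  # for (r, s) = (1, 2) this is (sqrt(2) - 1, 3 - 2 sqrt(2))
```

The pair $(1,2)$ reproduces the values $(\sqrt{2}-1,\ 3-2\sqrt{2})$ listed for $h = 7$ and $h = 8$ in Corollary 3.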



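4. $\mathcal{H} \subseteq \mathrm{Forb}(K_h)$

In this section, we prove Theorem 1, which bounds the edit distance function for hereditary properties that have no copy of a complete graph. Note that $\mathcal{H} \subseteq \mathrm{Forb}(K_h)$ if and only if $K_h \in \mathcal{F}(\mathcal{H})$.

Proof of Theorem 1. Since $\overline{\chi}(\mathcal{H}) = 1$ and $\mathcal{H}$ is not trivial, $\chi(\mathcal{H}) > 1$. If $K \in \mathcal{K}(\mathcal{H})$, then $K$ cannot have a black vertex, otherwise $K_h \mapsto K$. So, we may assume that $K \in \mathcal{K}(\mathcal{H})$ has all white vertices. In every set of $\chi$ white vertices, there must be a non-gray edge. By Turán's theorem, this means that $K$, on $k$ vertices, has at least $\binom{k}{2} - \frac{\chi-2}{\chi-1}\cdot\frac{k^2}{2}$ non-gray edges. Hence, for every such $K$,

$$f_K(p) \ge \frac{1}{k^2}\left[pk + 2\min\{p,1-p\}\left(\binom{k}{2} - \frac{\chi-2}{\chi-1}\cdot\frac{k^2}{2}\right)\right] \ge \frac{\min\{p,1-p\}}{\chi-1},$$

and so, by Theorem 5, $\mathrm{ed}_{\mathcal{H}}(p) \ge \min\{p,1-p\}/(\chi-1)$.

In every set of $m$ white vertices, there must be a white edge. Again, by Turán's theorem, $f_K(p) \ge \frac{p}{m-1}$ for every such $K$, and so $\mathrm{ed}_{\mathcal{H}}(p) \ge \frac{p}{m-1}$. So, since $\mathrm{ed}_{\mathcal{H}}(p)$ is concave down, it is bounded below by both $p/(\chi-1)$ and the line segment connecting the points $\left(1/2,\ \frac{1}{2(\chi-1)}\right)$ and $\left(1,\ \frac{1}{m-1}\right)$. Hence,

$$\mathrm{ed}_{\mathcal{H}}(p) \ge \min\left\{\frac{p}{\chi-1},\ \frac{1-p}{\chi-1} + \frac{2p-1}{m-1}\right\}.$$

As to the upper bound, we give two CRGs into which no $H \in \mathcal{F}(\mathcal{H})$ can map. The first is $K^{(1)} = K(\chi-1, 0)$, the CRG with $\chi-1$ white vertices and all edges gray. By Corollary 8, $g_{K^{(1)}}(p) = p/(\chi-1)$.

The second CRG, $K^{(2)}$, has $m-1$ white vertices and all black edges. If there were some $H \in \mathcal{F}(\mathcal{H})$ such that $H \mapsto K^{(2)}$, then $H$ would be a complete $(m-1)$-partite graph, which is forbidden by our choice of $m$. By Proposition 9, $g_{K^{(2)}}(p) = \min\{p,\ 1-p+(2p-1)/(m-1)\}$. So,

$$\mathrm{ed}_{\mathcal{H}}(p) \le \min\left\{\frac{p}{\chi-1},\ p,\ 1-p+\frac{2p-1}{m-1}\right\}.$$

The final statement comes from the observation that if $\mathcal{H} = \mathrm{Forb}(K_h)$, then $\chi = m = h$. □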
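
The two bounds of Theorem 1 are easy to tabulate. The sketch below, with an illustrative function name, evaluates both sides for given $\chi$ and $m$ (both assumed to be at least 2) and confirms on sample points that they coincide with $p/(h-1)$ when $\chi = m = h$.

```python
from fractions import Fraction as F


def theorem1_bounds(p, chi, m):
    """Lower and upper bounds of Theorem 1 on ed_H(p), assuming 2 <= chi <= m."""
    lower = min(p / (chi - 1), (1 - p) / (chi - 1) + (2 * p - 1) / (m - 1))
    upper = min(p / (chi - 1), 1 - p + (2 * p - 1) / (m - 1))
    return lower, upper


# When chi = m = h the bounds meet, giving ed_{Forb(K_h)}(p) = p / (h - 1).
print(theorem1_bounds(F(1, 4), 4, 4))
# With chi < m there can be a gap for p > 1/2.
print(theorem1_bounds(F(3, 4), 3, 5))
```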



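By Theorem 10(v) we have the similar result for empty graphs: Let $\mathcal{H}$ be a nontrivial hereditary property such that $\mathcal{F}(\mathcal{H})$ contains an empty graph and let $h$ be the minimum positive integer such that $\mathcal{H} \subseteq \mathrm{Forb}(\overline{K_h})$. Let $\overline{\chi}$ be the complementary chromatic number (footnote 2) of $\mathcal{H}$ and $m$ be the smallest positive integer such that $\mathcal{F}(\mathcal{H})$ contains a graph consisting of $m$ disjoint cliques. Clearly, $\overline{\chi} \le m \le h$.

$$\min\left\{\frac{p}{\overline{\chi}-1} + \frac{1-2p}{m-1},\ \frac{1-p}{\overline{\chi}-1}\right\} \le \mathrm{ed}_{\mathcal{H}}(p) \le \min\left\{p + \frac{1-2p}{m-1},\ \frac{1-p}{\overline{\chi}-1}\right\}.$$

In particular, $\mathrm{ed}_{\mathrm{Forb}(\overline{K_h})}(p) = \frac{1-p}{h-1}$.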

5. The $p$-core CRGs

Recall that, in Remark 6, we observed that $\mathrm{ed}_{\mathcal{H}}(p) = \min\{g_K(p) : K \in \mathcal{K}(\mathcal{H})\}$. That is, for any hereditary property $\mathcal{H}$ and $p \in [0,1]$, there is a CRG, $K \in \mathcal{K}(\mathcal{H})$, such that $\mathrm{ed}_{\mathcal{H}}(p) = g_K(p)$. This is found by looking at so-called $p$-core CRGs. A CRG, $K$, is a $p$-core CRG, or simply a $p$-core, if $g_K(p) < g_{K'}(p)$ for all nontrivial sub-CRGs $K'$ of $K$.

Footnote 2. The term $\overline{\chi}(\mathcal{H})$ is the smallest number, $k$, such that some member of $\mathcal{F}(\mathcal{H})$ can be partitioned into $k$ cliques. In fact, $\overline{\chi}(\mathcal{H}) = \chi(\overline{\mathcal{H}})$.

Moreover, $p$-cores can be easily classified:

Theorem 12 (Marchant-Thomason, [13]). Let $K$ be a $p$-core CRG.

• If $p = 1/2$, then $K$ has all of its edges gray.

• If $p < 1/2$, then $EB(K) = \emptyset$ and there are no white edges incident to white vertices.

• If $p > 1/2$, then $EW(K) = \emptyset$ and there are no black edges incident to black vertices.


The optimal solution to the quadratic program in (11) is, in some sense, regular, as described in Theorem 13. Theorem 13 is the "symmetrization" referenced in the title (footnote 3). The fundamental observation is that if every optimal solution, $\mathbf{x}^*$, of (11) has no zero entries, then

(12)    $$M_K(p) \cdot \mathbf{x}^* = g_K(p)\,\mathbf{1},$$

where $\mathbf{1}$ is the all-ones vector. Of course, an optimal solution having no zero entries corresponds, by definition, to a CRG being $p$-core, and in that case the optimal solution to the quadratic program in (11) is unique.

By Theorem 12, if $K$ is a $p$-core CRG, then no edge has the same color as either of its endvertices, so we can reinterpret (12) as follows:

Theorem 13 (Marchant-Thomason, [13]). Let $K$ be a $p$-core CRG. There is a unique vector $\mathbf{x}$ that is an optimal solution to the quadratic program in (11). For all $v \in V(K)$, let the entry of $\mathbf{x}$ corresponding to $v$ be $\mathbf{x}(v)$. For each $v \in V(K)$,

$$g_K(p) = p\,d_W(v) + (1-p)\,d_B(v),$$

where

$$d_W(v) = \begin{cases} \mathbf{x}(v), & \text{if } v \in VW(K); \\ \sum_{vv' \in EW(K)} \mathbf{x}(v'), & \text{if } v \in VB(K); \end{cases}
\qquad
d_B(v) = \begin{cases} \mathbf{x}(v), & \text{if } v \in VB(K); \\ \sum_{vv' \in EB(K)} \mathbf{x}(v'), & \text{if } v \in VW(K). \end{cases}$$



6. Computing edit distance functions using symmetrization 

Theorem 13, Theorem 12, Remark 6 and the definition of $p$-cores provide all of the elements needed to express $d_G(v) := 1 - d_W(v) - d_B(v)$ for any vertex $v$ in a $p$-core CRG. It is often useful and intuitive to focus on the gray neighborhood of vertices.

Lemma 14. Let $p \in (0,1)$ and $K$ be a $p$-core CRG with optimal weight function $\mathbf{x}$.

(i) If $p \le 1/2$, then $\mathbf{x}(v) = g_K(p)/p$ for all $v \in VW(K)$ and

$$d_G(v) = \frac{p - g_K(p)}{p} + \frac{1-2p}{p}\,\mathbf{x}(v), \quad \text{for all } v \in VB(K).$$

(ii) If $p \ge 1/2$, then $\mathbf{x}(v) = g_K(p)/(1-p)$ for all $v \in VB(K)$ and

$$d_G(v) = \frac{1 - p - g_K(p)}{1-p} + \frac{2p-1}{1-p}\,\mathbf{x}(v), \quad \text{for all } v \in VW(K).$$

Footnote 3. Pikhurko [14] uses this term for the approach by Sidorenko [19].

Proof. We will prove the case for $p \le 1/2$. The case where $p \ge 1/2$ is symmetric.

Let $v \in VW(K)$. By Theorem 12, all vertices are incident to $v$ via a gray edge, and by Theorem 13, $g_K(p) = p\,\mathbf{x}(v)$. Now let $v \in VB(K)$. By Theorem 12, $v$ has no black neighbors and

$$g_K(p) = p\left(1 - \mathbf{x}(v) - d_G(v)\right) + (1-p)\,\mathbf{x}(v).$$

Solving for $d_G(v)$ gives the result. □
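
The formula in Lemma 14(i) can be checked on the simplest black-vertex example, $K(0,s)$ with all edges gray, where the optimal weights are uniform, $g_K(p) = (1-p)/s$ by Corollary 8, and every vertex has gray degree $1 - 1/s$. The sketch below uses exact arithmetic; the function name is illustrative.

```python
from fractions import Fraction as F


def gray_degree(p, g, x_v):
    """Lemma 14(i): d_G(v) for a black vertex v of a p-core, with p <= 1/2."""
    return (p - g) / p + (1 - 2 * p) / p * x_v


# K(0, s): s black vertices, all edges gray, uniform optimal weights 1/s.
s, p = 4, F(1, 3)
g = (1 - p) / s
print(gray_degree(p, g, F(1, s)))  # gray neighbours carry total weight 1 - 1/s
```

The same example meets the bound of Lemma 15(i) with equality, since $\mathbf{x}(v) = 1/s = g_K(p)/(1-p)$.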



Lemma 15. Let $p \in (0,1)$ and $K$ be a $p$-core CRG with optimal weight function $\mathbf{x}$.

(i) If $p \le 1/2$, then $\mathbf{x}(v) \le g_K(p)/(1-p)$ for all $v \in VB(K)$.
(ii) If $p \ge 1/2$, then $\mathbf{x}(v) \le g_K(p)/p$ for all $v \in VW(K)$.

Proof. We use the fact that $\mathbf{x}(v) + d_G(v) \le 1$. Applying Lemma 14 and solving for $\mathbf{x}(v)$ gives the result. □

Remark 16. From this point forward in the paper, if $K$ is a CRG under consideration and $p$ is fixed, $\mathbf{x}(v)$ will denote the weight of $v \in V(K)$ under the optimal solution of the quadratic program in equation (11) that defines $g_K$.



7. $\mathrm{Forb}(C_h)$, $h \in \{3,\ldots,9\}$

Thomason [20] reports that Ed Marchant has found the edit distance function for $C_5$ and $C_7$. Here we find the function for all $C_h$, $h \in \{3,\ldots,9\}$. The proofs in this section might be substantially similar to Marchant's.

In order to compute the edit distance function for cycles, we first make the observation that $C_3$ is a complete graph and so Theorem 1 gives Corollary 17.

Corollary 17.

$$\mathrm{ed}_{\mathrm{Forb}(C_3)}(p) = p/2.$$

Furthermore, the only $p$-core for which this is achieved for $p \in (0,1)$ is $K(2,0)$.

For $C_h$, $h \ge 4$, we first take care of easy cases so that the only $p$-cores that need to be considered have all black vertices. We use Lemma 18, which establishes the upper bound and eliminates all cases except when $p < 1/2$ and all vertices are black.

Lemma 18. Let $h \ge 4$ and $p \in (0,1)$.

$$\gamma_{\mathrm{Forb}(C_h)}(p) = \begin{cases}
p(1-p), & \text{if } h = 4; \\
\min\left\{\dfrac{p(1-p)}{1-p+(\lceil h/3\rceil - 1)p},\ \dfrac{1-p}{\lceil h/2\rceil - 1}\right\}, & \text{if } h \ge 6 \text{ is even; and} \\
\min\left\{\dfrac{p}{2},\ \dfrac{p(1-p)}{1-p+(\lceil h/3\rceil - 1)p},\ \dfrac{1-p}{\lceil h/2\rceil - 1}\right\}, & \text{if } h \text{ is odd.}
\end{cases}$$

Furthermore, if there is a $p$-core CRG, $K \in \mathcal{K}(\mathrm{Forb}(C_h))$, such that $g_K(p) < \gamma_{\mathrm{Forb}(C_h)}(p)$ for any $p \in (0,1)$, then $p < 1/2$ and $K$ has all black vertices.

Proof. We leave it to the reader to verify that the extreme points of the clique spectrum of $\mathrm{Forb}(C_h)$ are $(0, \lceil h/2\rceil - 1)$, $(1, \lceil h/3\rceil - 1)$ and, if $h$ is odd, $(2,0)$. This establishes the value of $\gamma_{\mathrm{Forb}(C_h)}(p)$.

If $h = 4$, the classes of possible CRGs are restricted. If $K$ has at least 2 white vertices, they are connected via a gray or black edge and so $C_4$ would embed in $K$. If $K$ has a white and at least two black vertices, then the edges between the white and black vertices are both gray and the edge between the black vertices is either gray or white and so $C_4$ would embed in $K$. Thus, if $K$ has a white vertex, then it has at most one black vertex and this is $K(1,1)$, the CRG that defines $\gamma_{\mathrm{Forb}(C_4)}(p) = p(1-p)$. If $K$ has only black vertices, then all of its edges are white (a gray edge between two black vertices would let $C_4$ embed), so by Proposition 9, $g_K(p) = \min\{p+(1-2p)/k,\ 1-p\} \ge p(1-p)$.

So, $\mathrm{ed}_{\mathrm{Forb}(C_4)}(p) = p(1-p)$.

Now, let $h \ge 5$ and write $\mathcal{H} = \mathrm{Forb}(C_h)$. Since $\gamma_{\mathcal{H}}(1/2) = \mathrm{ed}_{\mathcal{H}}(1/2)$ for all hereditary properties and $0 = \gamma_{\mathrm{Forb}(C_h)}(1)$, concavity gives that $\mathrm{ed}_{\mathcal{H}}(p) = \frac{1-p}{\lceil h/2\rceil - 1}$ for all $p \ge 1/2$.

Finally, let $p \in (0,1/2)$ and $K$ be a $p$-core CRG such that $C_h \not\mapsto K$. If $K$ has only white vertices and $h$ is even, then $K \approx K(1,0)$ and $g_K(p) = p > \gamma_{\mathcal{H}}(p)$. If $K$ has only white vertices and $h$ is odd, then there are at most 2 white vertices and $g_K(p) \ge p/2$ with equality if and only if $K \approx K(2,0)$.

If $K$ has both white and black vertices, then it has at most 1 white vertex because $C_h \mapsto K(2,1)$. Furthermore, it can have at most $\lceil h/3\rceil - 1$ black vertices. To see this, denote the vertices of $C_h$ by $\{0,1,\ldots,h-1\}$ where $0 \sim 1 \sim \cdots \sim h-1 \sim 0$. Let $S$ consist of the members of $\{0,\ldots,h-2\}$ that are divisible by 3. If $h-1$ is divisible by 3, then add $h-2$ to $S$. The graph $C_h - S$ has $\lceil h/3\rceil$ connected components, each of which is a clique of size 1 or 2. Thus, regardless of whether the edges are white or gray, there are at most $\lceil h/3\rceil - 1$ black vertices in $K$ and $g_K(p) \ge \frac{p(1-p)}{1-p+(\lceil h/3\rceil - 1)p}$, with equality if and only if $K \approx K(1, \lceil h/3\rceil - 1)$.

Summarizing, if $p \in (0,1/2)$ and $g_K(p) = \mathrm{ed}_{\mathrm{Forb}(C_h)}(p)$, then $K$ is either $K(0, \lceil h/2\rceil - 1)$, $K(1, \lceil h/3\rceil - 1)$, $K(2,0)$ and $h$ is odd, or $K$ has all black vertices (and white or gray edges). □
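
The construction of the set $S$ in the proof can be replayed mechanically. The sketch below, with illustrative function names, builds $S$ for a given $h$, lists the connected components of $C_h - S$, and lets one confirm that $S$ is an independent set and that there are $\lceil h/3\rceil$ components, each with one or two vertices.

```python
def removed_set(h):
    """The set S from the proof: multiples of 3 in {0, ..., h-2}, plus h-2 when 3 | (h-1)."""
    S = [v for v in range(h - 1) if v % 3 == 0]
    if (h - 1) % 3 == 0:
        S.append(h - 2)
    return S


def components_after_removal(h, S):
    """Connected components of C_h - S, each a run of consecutive vertices.

    Vertex 0 always belongs to S, so no run wraps around from h-1 to 0.
    """
    removed = set(S)
    comps, current = [], []
    for v in range(h):
        if v in removed:
            if current:
                comps.append(current)
                current = []
        else:
            current.append(v)
    if current:
        comps.append(current)
    return comps


for h in (7, 9, 10):
    S = removed_set(h)
    print(h, S, components_after_removal(h, S))
```

Mapping $S$ to the single white vertex and each component to its own black vertex is what forces $C_h \mapsto K(1, \lceil h/3\rceil)$ in the argument above.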



From this point forward, we restrict ourselves to $p \in (0,1/2)$ and CRGs, $K$, with only black vertices and white or gray edges because of Lemma 18. We can immediately address 4- and 5-cycles. Corollary 19 and Corollary 20 have appeared before. Corollary 19 was proven in the proof of Lemma 18.

Corollary 19 (Marchant-Thomason [13]).

$$\mathrm{ed}_{\mathrm{Forb}(C_4)}(p) = p(1-p).$$

Corollary 20 ([12]).

$$\mathrm{ed}_{\mathrm{Forb}(C_5)}(p) = \min\left\{\frac{p}{2},\ \frac{1-p}{2}\right\}.$$

Proof. Thanks to Lemma 18, we can restrict to $p \in (0,1/2)$ and $p$-core CRGs $K \in \mathcal{K}(\mathrm{Forb}(C_5))$ for which the vertices are black. Let $v_1$ have largest weight in $K$ and $v_2$ have largest weight in $N_G(v_1)$. Let $g$ denote $g_K(p)$. Since $K$ has no gray triangles,

$$d_G(v_1) + d_G(v_2) \le 1$$

$$2\cdot\frac{p-g}{p} + \frac{1-2p}{p}\left(\mathbf{x}(v_1) + \mathbf{x}(v_2)\right) \le 1.$$

So, $g > p/2$, a contradiction. □



See Figure 2 and Figure 3.

Figure 2. Plot of $\mathrm{ed}_{\mathrm{Forb}(C_4)}(p) = p(1-p)$. The boundary of the shaded region is $\mathrm{ed}_{\mathrm{Forb}(C_4)}(p)$.

Figure 3. Plot of $\mathrm{ed}_{\mathrm{Forb}(C_5)}(p) = \min\{p/2,\ (1-p)/2\}$.



Proposition 21 shows that, in order to find CRGs with black vertices and white or gray edges into which $C_h$ does not embed, there are many lengths of gray cycles that are forbidden in the CRG.

Proposition 21. Let $p \in (0,1/2)$ and $K$ be a $p$-core CRG such that $K$ has black vertices and white and gray edges. If $C_h \not\mapsto K$, then $K$ has no gray cycle with length in $\{\lceil h/2\rceil,\ldots,h\}$.

Proof. If $C_h \mapsto K$, then each vertex of $K$ receives either one vertex or two consecutive vertices of the cycle. Thus, the gray cycle that $K$ must contain is one that corresponds to the contraction of those edges of $C_h$ that map to a single black vertex of $K$. Since these edges form a matching, the cycle required to be in $K$ has length at least $\lceil h/2\rceil$ and at most $h$. □



In order to deal with $\mathrm{Forb}(C_h)$ for $h \ge 6$, we use Proposition 21 along with two major lemmas. Lemma 22 is a general structural lemma and the results on $\mathrm{Forb}(C_h)$ that we give are immediate corollaries. It should be noted that if we write that a CRG, say, "has no gray 4-cycle," we mean so in the subgraph sense, so it does not contain a gray $K_4$ either.

Lemma 22. Let $p \in (0,1/2)$ and $K$ be a $p$-core with black vertices and white or gray edges.

(i) If $K$ has no gray edge, then $g_K(p) > p$.

(ii) If $K$ has neither a gray 3-cycle nor a gray 4-cycle, then $g_K(p) > p(1-p)$.

(iii) If $K$ has no gray 3-cycle, then $g_K(p) > p/2$.

(iv) If $K$ has a gray 3-cycle, but no gray $C_4^+$ (that is, four vertices that induce 5 gray edges), then $g_K(p) \ge \min\{2p/3,\ (1-p)/3\}$.

(v) If $K$ has no gray 4-cycle, then $g_K(p) > p(1-p)$ for $p \in (0,1/3)$.

(vi) If $K$ has a gray $C_4^+$ but no gray $C_5^{++}$ (that is, five vertices that induce some 5-cycle with two chords), then $g_K(p) \ge \min\{2p/3,\ p(1-p)/(1+p)\}$.

(vii) If $K$ has a gray chordless 4-cycle, but no gray $K_{3,3}^-$ (that is, a $K_{3,3}$ missing an edge), then $g_K(p) \ge \min\{2p/3,\ 2p(1-p)/(2+p)\}$. Note that $K_{3,3}^-$ has a 6-cycle as a subgraph.

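Proof. For ease of notation, in calculations, we sometimes let $g$ denote $g_K(p)$.

(i) If $K$ has no gray edges, then for any $v \in V(K)$, $g = p + (1-2p)\,\mathbf{x}(v) > p$.

(ii) Let $v_0 \in V(K)$ have the largest weight and $N_G(v_0) = \{v_1,\ldots,v_\ell\}$, the gray neighborhood of $v_0$. Let $x_i = \mathbf{x}(v_i)$ for $i = 0,1,\ldots,\ell$. Since there are no gray triangles, there are no gray edges in $N_G(v_0)$ and since there are no gray quadrangles, $N_G(v_i) - \{v_0\}$ and $N_G(v_j) - \{v_0\}$ are disjoint for all distinct $i,j \in \{1,\ldots,\ell\}$. So, $\{v_0\}$, $N_G(v_0)$ and each $N_G(v_i) - \{v_0\}$, $i = 1,\ldots,\ell$, form a family of $\ell + 2$ pairwise disjoint sets. Using Lemma 14(i),

$$\begin{aligned}
x_0 + d_G(v_0) + \sum_{i=1}^{\ell}\left[d_G(v_i) - x_0\right] &\le 1 \\
x_0 + d_G(v_0) + \sum_{i=1}^{\ell}\left[\frac{p-g}{p} + \frac{1-2p}{p}x_i - x_0\right] &\le 1 \\
x_0 + d_G(v_0) + \ell\left[\frac{p-g}{p} - x_0\right] + \frac{1-2p}{p}\,d_G(v_0) &\le 1 \\
x_0 + \frac{1-p}{p}\,d_G(v_0) + \ell\left[\frac{p-g}{p} - x_0\right] &\le 1.
\end{aligned}$$

Since $x_0$ is the largest weight, $\ell \ge d_G(v_0)/x_0$ and, as long as $g \le p(1-p)$, we have $\frac{p-g}{p} - x_0 \ge \frac{p-g}{p} - \frac{g}{1-p} \ge 0$ by Lemma 15(i). Consequently,

$$\begin{aligned}
x_0 + \frac{1-p}{p}\,d_G(v_0) + \frac{d_G(v_0)}{x_0}\left[\frac{p-g}{p} - x_0\right] &\le 1 \\
x_0^2 + d_G(v_0)\left[\frac{p-g}{p} + \frac{1-2p}{p}x_0\right] &\le x_0 \\
x_0^2 + \left[\frac{p-g}{p} + \frac{1-2p}{p}x_0\right]^2 &\le x_0
\end{aligned}$$

(13)    $$\left(\frac{p-g}{p}\right)^2 + \left[2\cdot\frac{p-g}{p}\cdot\frac{1-2p}{p} - 1\right]x_0 + \left[1 + \left(\frac{1-2p}{p}\right)^2\right]x_0^2 \le 0.$$

A quadratic expression of the form $c + bx + ax^2$ with $a > 0$ has a minimum value of $c - b^2/(4a)$. Applying this to (13),

$$\left(\frac{p-g}{p}\right)^2 - \frac{\left(2\cdot\frac{p-g}{p}\cdot\frac{1-2p}{p} - 1\right)^2}{4\left(1 + \left(\frac{1-2p}{p}\right)^2\right)} \le 0.$$

So,

$$\frac{p-g}{p} \le \frac{1}{2}\left(-\frac{1-2p}{p} + \sqrt{\left(\frac{1-2p}{p}\right)^2 + 1}\right)$$

$$g \ge \frac{1}{2}\left(1 - \sqrt{1 - 4p + 5p^2}\right).$$

This expression is greater than $p(1-p)$ for all $p \in (0,1/2)$, so $g > p(1-p)$.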
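
The final comparison in part (ii) can be spot-checked numerically. The sketch below evaluates the bound $\frac{1}{2}\left(1 - \sqrt{1-4p+5p^2}\right)$ on a grid in $(0,1/2)$ and compares it with $p(1-p)$; this is a floating-point sanity check under the formula as reconstructed above, not a substitute for the algebra, and the function name is illustrative.

```python
import math


def bound_ii(p):
    """Lower bound on g from part (ii): (1 - sqrt(1 - 4p + 5p^2)) / 2."""
    return 0.5 * (1 - math.sqrt(1 - 4 * p + 5 * p * p))


# Smallest gap between the bound and p(1-p) over the grid p = 0.001, ..., 0.499.
gap = min(bound_ii(i / 1000) - (i / 1000) * (1 - i / 1000) for i in range(1, 500))
print(gap > 0)
```

Algebraically, squaring shows the gap has the same sign as $p^2(1-2p)(3-2p)$, which is positive on $(0,1/2)$ and vanishes at $p = 1/2$.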



(iii) By (i), we may assume that $K$ has a gray edge, otherwise $g_K(p) > p$. Let $v_1v_2$ be a gray edge and $x_i = \mathbf{x}(v_i)$ for $i = 1,2$. Since they have no common gray neighbor,

$$d_G(v_1) + d_G(v_2) \le 1$$

$$2\cdot\frac{p-g}{p} + \frac{1-2p}{p}\,(x_1 + x_2) \le 1.$$

Since $x_1 + x_2 > 0$, we have $g > p/2$.

(iv) Let $\{v_1,v_2,v_3\}$ be a gray triangle in $K$ where $x_i = \mathbf{x}(v_i)$ for $i = 1,2,3$. Because no pair of the $v_i$ can have a common gray neighbor other than the remaining $v_j$,

$$\begin{aligned}
\sum_{i=1}^{3}\left[d_G(v_i) - (x_1+x_2+x_3 - x_i)\right] + (x_1+x_2+x_3) &\le 1 \\
\sum_{i=1}^{3} d_G(v_i) - (x_1+x_2+x_3) &\le 1 \\
3\left(\frac{p-g}{p}\right) + \frac{1-3p}{p}\,(x_1+x_2+x_3) &\le 1 \\
\frac{2p}{3} + \frac{1-3p}{3}\,(x_1+x_2+x_3) &\le g.
\end{aligned}$$

If $p < 1/3$, then $g > 2p/3$. If $p \ge 1/3$, then $x_1+x_2+x_3 \le 1$ implies that $g \ge (1-p)/3$.

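(v) Let $v_0 \in V(K)$ have the largest weight. Since there are no gray quadrangles, no member of $N_G(v_0)$ has more than one gray neighbor in $N_G(v_0)$. Let $N_G(v_0) = \{v_1,v_1',\ldots,v_m,v_m'\} \cup \{v_{2m+1},\ldots,v_\ell\}$, the gray neighborhood of $v_0$, such that for $i = 1,\ldots,m$, $v_iv_i'$ is a gray edge. Let $x_i = \mathbf{x}(v_i)$ and $x_i' = \mathbf{x}(v_i')$, with $x_0 = \mathbf{x}(v_0)$. Since there are no gray quadrangles, the gray neighborhoods outside of $\{v_0\} \cup N_G(v_0)$ of distinct vertices in $N_G(v_0)$ are disjoint. Hence,

$$\begin{aligned}
x_0 + d_G(v_0) + \sum_{i=1}^{m}\left[d_G(v_i) + d_G(v_i') - x_i - x_i' - 2x_0\right] + \sum_{j=2m+1}^{\ell}\left[d_G(v_j) - x_0\right] &\le 1 \\
x_0 + d_G(v_0) + \ell\left[\frac{p-g}{p} - x_0\right] + \sum_{i=1}^{m}\left(\frac{1-2p}{p} - 1\right)(x_i + x_i') + \sum_{j=2m+1}^{\ell}\frac{1-2p}{p}\,x_j &\le 1 \\
\ell\left[\frac{p-g}{p} - x_0\right] + x_0 + d_G(v_0) + \frac{1-3p}{p}\,d_G(v_0) &\le 1.
\end{aligned}$$

Again, we use the fact that $\ell \ge d_G(v_0)/x_0$ and $\frac{p-g}{p} - x_0 \ge 0$.

$$\begin{aligned}
\frac{d_G(v_0)}{x_0}\left[\frac{p-g}{p} - x_0\right] + x_0 + \frac{1-2p}{p}\,d_G(v_0) &\le 1 \\
\left(\frac{p-g}{p}\right)^2 + \left[\frac{p-g}{p}\cdot\frac{2-5p}{p} - 1\right]x_0 + \left[\frac{1-2p}{p}\cdot\frac{1-3p}{p} + 1\right]x_0^2 &\le 0.
\end{aligned}$$

Optimizing over $x_0$,

$$\begin{aligned}
\left(\frac{p-g}{p}\right)^2 - \frac{\left(\frac{p-g}{p}\cdot\frac{2-5p}{p} - 1\right)^2}{4\left(\frac{1-2p}{p}\cdot\frac{1-3p}{p} + 1\right)} &\le 0 \\
3\left(\frac{p-g}{p}\right)^2 + 2\cdot\frac{p-g}{p}\cdot\frac{2-5p}{p} - 1 &\le 0.
\end{aligned}$$

So,

$$\frac{p-g}{p} \le \frac{1}{3}\left(-\frac{2-5p}{p} + \sqrt{\left(\frac{2-5p}{p}\right)^2 + 3}\right)$$

$$g \ge \frac{2}{3}\left((1-p) - \sqrt{1 - 5p + 7p^2}\right).$$

Some calculations show that $g > p(1-p)$ for $p \in (0,1/3)$.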
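
The "some calculations" at the end of part (v) can be spot-checked numerically. The sketch below evaluates $\frac{2}{3}\left((1-p) - \sqrt{1-5p+7p^2}\right)$ on a grid in $(0,1/3)$ and compares it with $p(1-p)$; it is a floating-point check under the formula as reconstructed above, with an illustrative function name.

```python
import math


def bound_v(p):
    """Lower bound on g from part (v): (2/3) * ((1 - p) - sqrt(1 - 5p + 7p^2))."""
    return (2 / 3) * ((1 - p) - math.sqrt(1 - 5 * p + 7 * p * p))


# Compare with p(1-p) at p = 0.001, 0.002, ..., 0.333.
ok = all(bound_v(i / 1000) > (i / 1000) * (1 - i / 1000) for i in range(1, 334))
print(ok, bound_v(1 / 3), (1 / 3) * (2 / 3))
```

The two expressions agree at $p = 1/3$, where both equal $2/9$, which is consistent with the restriction of part (v) to $p \in (0,1/3)$.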



(vi) Let the gray $C_4^+$ be denoted $\{v_1,v_2,v_3,v_4\}$ such that all edges are gray except, perhaps, $v_1v_3$. Let $x_i = \mathbf{x}(v_i)$ for $i = 1,2,3,4$. Without loss of generality, let $x_2 \ge x_4$.

No pair $(v_i,v_j)$ can have a common gray neighbor except, perhaps, $(v_2,v_4)$. Denoting $N_G(v)$ to be the set of gray neighbors of vertex $v$, the sets $N_G(v_1) - \{v_2,v_4\}$, $N_G(v_3) - \{v_2,v_4\}$ and $N_G(v_2) - \{v_1,v_3,v_4\}$ must be disjoint. So,

$$\begin{aligned}
\left(d_G(v_1) - x_2 - x_4\right) + \left(d_G(v_3) - x_2 - x_4\right) + \left(d_G(v_2) - x_1 - x_3 - x_4\right) + (x_1+x_2+x_3+x_4) &\le 1 \\
3\cdot\frac{p-g}{p} + \frac{1-2p}{p}\,(x_1+x_3) + \frac{1-3p}{p}\,x_2 - 2x_4 &\le 1 \\
2 + \frac{1-2p}{p}\,(x_1+x_3) + \frac{1-3p}{p}\,x_2 - 2x_4 &\le \frac{3g}{p}.
\end{aligned}$$

Solving for $g$,

$$\begin{aligned}
g &\ge \frac{2p}{3} + \frac{1-2p}{3}\,(x_1+x_3) + \frac{1-3p}{3}\,x_2 - \frac{2p}{3}\,x_4 \\
&\ge \frac{2p}{3} + \frac{1-2p}{3}\,(x_1+x_3) + \frac{1-5p}{3}\,x_2.
\end{aligned}$$

If $p \le 1/5$, then $g \ge 2p/3$. If $p > 1/5$, then we use Lemma 15(i), which gives that $x_2 \le g/(1-p)$. So,

$$\begin{aligned}
g &\ge \frac{2p}{3} + \frac{1-2p}{3}\,(x_1+x_3) + \frac{1-5p}{3}\cdot\frac{g}{1-p} \\
g &\ge \frac{p(1-p)}{1+p} + \frac{(1-2p)(1-p)}{2(1+p)}\,(x_1+x_3).
\end{aligned}$$

Consequently, $g \ge p(1-p)/(1+p)$.

(vii) Let the gray 4-cycle be denoted $\{v_1, v_2, v_3, v_4\}$ such that all edges are gray except $v_1v_3$ and $v_2v_4$. Let $x_i = x(v_i)$ for $i = 1, 2, 3, 4$. If both pairs $(v_1, v_3)$ and $(v_2, v_4)$ have common neighbors outside of $\{v_1, v_2, v_3, v_4\}$, then a $K_{3,3}$ is formed. So, suppose $v_2$ and $v_4$ have no common neighbors other than $v_1$ and $v_3$. Without loss of generality, let $x_2 \ge x_4$.

The sets $N_G(v_1) - \{v_2, v_4\}$, $N_G(v_3) - \{v_2, v_4\}$ and $N_G(v_2) - \{v_1, v_3\}$ must be disjoint. So,

\[ (d_G(v_1) - x_2 - x_4) + (d_G(v_3) - x_2 - x_4) + (d_G(v_2) - x_1 - x_3) + (x_1 + x_2 + x_3 + x_4) \le 1 \]
\[ 3\,\frac{p-g}{p} + \frac{1-2p}{p}\,(x_1 + x_3) + \frac{1-3p}{p}\,x_2 - x_4 \le 1 \]
\[ 2 + \frac{1-2p}{p}\,(x_1 + x_3) + \frac{1-3p}{p}\,x_2 - x_4 \le \frac{3g}{p}. \]

Solving for $g$,

\[ g \ge \frac{2p}{3} + \frac{1-2p}{3}\,(x_1 + x_3) + \frac{1-3p}{3}\,x_2 - \frac{p}{3}\,x_4 \]
\[ \phantom{g} \ge \frac{2p}{3} + \frac{1-2p}{3}\,(x_1 + x_3) + \frac{1-4p}{3}\,x_2. \]

If $p \le 1/4$, then $g \ge 2p/3$. If $p > 1/4$, then we use Lemma 15(i), which gives that $x_2 \le g/(1-p)$.

\[ g \ge \frac{2p}{3} + \frac{1-2p}{3}\,(x_1 + x_3) + \frac{1-4p}{3}\cdot\frac{g}{1-p} \]
\[ g \ge \frac{2p(1-p)}{2+p} + \frac{(1-2p)(1-p)}{2+p}\,(x_1 + x_3). \]

Consequently, $g \ge 2p(1-p)/(2+p)$.
This concludes the proof of Lemma 22. □
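The last algebraic step in items (vi) and (vii) can be checked with exact rational arithmetic. The sketch below assumes the step has the shape $g \ge \frac{2p}{3} + \frac{c}{3}\cdot\frac{g}{1-p}$ after dropping the nonnegative $(x_1 + x_3)$ term, with $c = 1-5p$ in (vi) and $c = 1-4p$ in (vii), and confirms that solving for $g$ gives $p(1-p)/(1+p)$ and $2p(1-p)/(2+p)$ respectively. All function names are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction


def solved_bound(p: Fraction, c: Fraction) -> Fraction:
    """Solve g = 2p/3 + (c/3) * g/(1 - p) exactly for g."""
    return (2 * p / 3) / (1 - c / (3 * (1 - p)))


def closed_form_vi(p: Fraction) -> Fraction:
    """Closed form stated at the end of item (vi)."""
    return p * (1 - p) / (1 + p)


def closed_form_vii(p: Fraction) -> Fraction:
    """Closed form stated at the end of item (vii)."""
    return 2 * p * (1 - p) / (2 + p)


def forms_agree(steps: int = 200) -> bool:
    """Compare the solved bounds with the closed forms on a rational grid in (0, 1/2)."""
    grid = [Fraction(k, 2 * steps) for k in range(1, steps)]
    return all(
        solved_bound(p, 1 - 5 * p) == closed_form_vi(p)
        and solved_bound(p, 1 - 4 * p) == closed_form_vii(p)
        for p in grid
    )


if __name__ == "__main__":
    print(forms_agree())
```

Exact fractions are used instead of floats so that the comparison is an identity check rather than a tolerance check.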



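Corollary 23.

\[ \mathit{ed}_{\mathrm{Forb}(C_6)}(p) = \min\left\{p(1-p),\ \frac{1-p}{2}\right\}. \]

Proof. Lemma 18 gives that the function stated above is $\gamma_{\mathrm{Forb}(C_6)}(p)$ and so $\mathit{ed}_{\mathrm{Forb}(C_6)}(p) \le \min\left\{p(1-p), \frac{1-p}{2}\right\}$. By Lemma 18 we only need to consider $p \in (0, 1/2)$ and $K$ being a black-vertex $p$-core CRG in $\mathcal{K}(\mathrm{Forb}(C_6))$ for which $g_K(p) < \gamma_{\mathrm{Forb}(C_6)}(p)$.

By Proposition 21, $K$ has neither a gray 3-cycle nor a gray 4-cycle. Lemma 22(ii) gives that $g_K(p) \ge p(1-p)$. So, there is no such $K$ and the corollary follows. □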



Corollary 24.

\[ \mathit{ed}_{\mathrm{Forb}(C_7)}(p) = \min\left\{\frac{p}{2},\ \frac{p(1-p)}{1+p},\ \frac{1-p}{3}\right\}. \]

Proof. The function stated above is $\gamma_{\mathrm{Forb}(C_7)}(p)$. Let $p \in (0, 1/2)$ and suppose $K$ is a black-vertex $p$-core CRG in $\mathcal{K}(\mathrm{Forb}(C_7))$ for which $g_K(p) < \gamma_{\mathrm{Forb}(C_7)}(p)$. By Proposition 21, $K$ has no gray 4-cycle.

Since $K$ has no gray 4-cycle, then by Lemma 22(ii), either $g_K(p) \ge p(1-p)$ or $K$ has a gray 3-cycle. In terms of the former, it is trivial that this is a contradiction to $g_K(p) < \gamma_{\mathrm{Forb}(C_7)}(p)$ for $p \in (0, 1/2)$, so we assume that $K$ has a gray 3-cycle.

If $K$ has a gray 3-cycle but no $C_4^+$, then by Lemma 22(iv), we have $g_K(p) \ge \min\{2p/3, (1-p)/3\}$. Straightforward calculations verify that this is a contradiction to $g_K(p) < \gamma_{\mathrm{Forb}(C_7)}(p)$ for $p \in (0, 1/2)$. □
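The "straightforward calculations" in this proof amount to checking that each lower bound on $g_K(p)$ is at least $\gamma_{\mathrm{Forb}(C_7)}(p)$ on $(0, 1/2)$. The sketch below assumes $\gamma_{\mathrm{Forb}(C_7)}(p) = \min\{p/2,\ p(1-p)/(1+p),\ (1-p)/3\}$ and tests the two bounds used in the proof, $p(1-p)$ and $\min\{2p/3, (1-p)/3\}$, on a rational grid. The function names are illustrative, and a grid check only supports the claim rather than proving it.

```python
from fractions import Fraction


def gamma_c7(p: Fraction) -> Fraction:
    """Assumed form of gamma for Forb(C_7)."""
    return min(p / 2, p * (1 - p) / (1 + p), (1 - p) / 3)


def bound_no_short_cycles(p: Fraction) -> Fraction:
    """Bound used when K has no gray 3-cycle and no gray 4-cycle."""
    return p * (1 - p)


def bound_triangle(p: Fraction) -> Fraction:
    """Bound used when K has a gray 3-cycle but no 4-cycle with a chord."""
    return min(2 * p / 3, (1 - p) / 3)


def both_cases_contradict(steps: int = 500) -> bool:
    """True if both bounds are >= gamma_c7 at every grid point in (0, 1/2)."""
    grid = [Fraction(k, 2 * steps) for k in range(1, steps)]
    return all(
        bound_no_short_cycles(p) >= gamma_c7(p) and bound_triangle(p) >= gamma_c7(p)
        for p in grid
    )


if __name__ == "__main__":
    print(both_cases_contradict())
```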



Corollary 25.

\[ \mathit{ed}_{\mathrm{Forb}(C_8)}(p) = \min\left\{\frac{p(1-p)}{1+p},\ \frac{1-p}{3}\right\}. \]

Proof. The proof is the same as for Corollary 24. □



Corollary 26.

\[ \mathit{ed}_{\mathrm{Forb}(C_9)}(p) = \min\left\{\frac{p}{2},\ \frac{1-p}{4}\right\}. \]

Proof. The function stated above is $\gamma_{\mathrm{Forb}(C_9)}(p)$. Let $p \in (0, 1/2)$ and suppose $K$ is a black-vertex $p$-core CRG in $\mathcal{K}(\mathrm{Forb}(C_9))$ for which $g_K(p) < \gamma_{\mathrm{Forb}(C_9)}(p)$. By Proposition 21, $K$ has no gray $C_5$.

Since $K$ has no gray $C_5$, then by Lemma 22(vi), either $g_K(p) \ge \min\{2p/3, p(1-p)/(1+p)\}$ or $K$ has no gray $C_4^+$. In terms of the former, straightforward calculations verify that this is a contradiction to $g_K(p) < \gamma_{\mathrm{Forb}(C_9)}(p)$ for $p \in (0, 1/2)$, so we assume that $K$ has no gray $C_4^+$.

If $K$ has no gray $C_4^+$, then by Lemma 22(iv), either $g_K(p) \ge \min\{2p/3, (1-p)/3\}$ or $K$ has no gray 3-cycle. In terms of the former, it is trivial that this is a contradiction to $g_K(p) < \gamma_{\mathrm{Forb}(C_9)}(p)$ for $p \in (0, 1/2)$, so we assume that $K$ has no gray 3-cycle. If that is the case, however, Lemma 22(iii) gives that $g_K(p) \ge p/2$, a contradiction. So, there is no such $K$ for which $g_K(p) < \gamma_{\mathrm{Forb}(C_9)}(p)$ and the corollary follows. □



Corollary 27.

\[ \mathit{ed}_{\mathrm{Forb}(C_{10})}(p) = \min\left\{\frac{p(1-p)}{1+2p},\ \frac{1-p}{4}\right\}, \qquad \text{if } p \in [1/7, 1]. \]

Proof. The function stated above is $\gamma_{\mathrm{Forb}(C_{10})}(p)$. Let $p \in (0, 1/2)$ and suppose $K$ is a black-vertex $p$-core CRG in $\mathcal{K}(\mathrm{Forb}(C_{10}))$ for which $g_K(p) < \gamma_{\mathrm{Forb}(C_{10})}(p)$. By Proposition 21, $K$ has no gray $C_5$.

Since $K$ has no gray $C_5$, then by Lemma 22(vi), either $g_K(p) \ge \min\{2p/3, p(1-p)/(1+p)\}$ or $K$ has no gray $C_4^+$. In terms of the former, straightforward calculations verify that this is a contradiction to $g_K(p) < \gamma_{\mathrm{Forb}(C_{10})}(p)$ for $p \in [1/7, 1/2)$, so we assume that $K$ has no gray $C_4^+$.

If $K$ has no gray $C_4^+$, then by Lemma 22(iv), either $g_K(p) \ge \min\{2p/3, (1-p)/3\}$ or $K$ has no gray 3-cycle. In terms of the former, it is trivial that this is a contradiction to $g_K(p) < \gamma_{\mathrm{Forb}(C_{10})}(p)$ for $p \in [1/7, 1/2)$, so we assume that $K$ has no gray 3-cycle.

If $K$ has no gray 3-cycle, then by Lemma 22(ii), either $g_K(p) \ge p(1-p)$ or $K$ has a gray 4-cycle. In terms of the former, it is trivial that this is a contradiction to $g_K(p) < \gamma_{\mathrm{Forb}(C_{10})}(p)$ for $p \in (0, 1/2)$, so we assume $K$ has a gray 4-cycle, but since it cannot be a $C_4^+$, it must be a gray chordless 4-cycle.

If $K$ has a chordless gray 4-cycle, then by Lemma 22(vii), either $g_K(p) \ge \min\{2p/3, 2p(1-p)/(2+p)\}$ or $K$ has a gray $K_{3,3}$. In terms of the former, straightforward calculations verify that this is a contradiction to $g_K(p) < \gamma_{\mathrm{Forb}(C_{10})}(p)$ for $p \in [1/7, 1/2)$, so we assume that $K$ has a gray $K_{3,3}$. However, as observed in Lemma 22, this contains a gray 6-cycle, which is a contradiction to $K \in \mathcal{K}(\mathrm{Forb}(C_{10}))$. □
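The role of the threshold $p = 1/7$ in this proof can be illustrated with exact arithmetic. The sketch below assumes $\gamma_{\mathrm{Forb}(C_{10})}(p) = \min\{p(1-p)/(1+2p),\ (1-p)/4\}$ and the four lower bounds quoted in the proof, then lists which of them fall below $\gamma_{\mathrm{Forb}(C_{10})}(p)$ at a given $p$. The dictionary keys are descriptive labels chosen here, not the paper's terminology.

```python
from fractions import Fraction


def gamma_c10(p: Fraction) -> Fraction:
    """Assumed form of gamma for Forb(C_10)."""
    return min(p * (1 - p) / (1 + 2 * p), (1 - p) / 4)


# Lower bounds on g_K(p) quoted in the proof, keyed by the case they cover.
LOWER_BOUNDS = {
    "gray 4-cycle with a chord": lambda p: min(2 * p / 3, p * (1 - p) / (1 + p)),
    "gray 3-cycle": lambda p: min(2 * p / 3, (1 - p) / 3),
    "no gray 3-cycle or 4-cycle": lambda p: p * (1 - p),
    "chordless gray 4-cycle": lambda p: min(2 * p / 3, 2 * p * (1 - p) / (2 + p)),
}


def failing_cases(p: Fraction) -> list:
    """Cases whose lower bound is strictly below gamma_c10(p), so no contradiction arises."""
    return [name for name, bound in LOWER_BOUNDS.items() if bound(p) < gamma_c10(p)]


def all_cases_contradict(steps: int = 500) -> bool:
    """True if no case fails on a rational grid covering [1/7, 1/2)."""
    lo, hi = Fraction(1, 7), Fraction(1, 2)
    grid = [lo + (hi - lo) * Fraction(k, steps) for k in range(steps)]
    return all(not failing_cases(p) for p in grid)


if __name__ == "__main__":
    print(all_cases_contradict())
    print(failing_cases(Fraction(1, 8)))
```

The comparison $2p/3 \ge p(1-p)/(1+2p)$ is equivalent to $p \ge 1/7$, so every bound that contains the term $2p/3$ stops giving a contradiction just below $1/7$.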

Remark 28. See Figures 4–8 for plots of the edit distance functions described in Corollaries 23, 24, 25, 26 and 27.



8. Conclusions 

8.1. $\mathrm{Forb}(G(n_0, p_0))$. We provide a conjecture with some interesting implications. Recall that $G(n, p)$ denotes the Erdős-Rényi random graph on $n$ vertices with edge-probability $p$. The hereditary property $\mathcal{H} = \mathrm{Forb}(G(n_0, p_0))$ is a random variable.

Figure 4. Plot of $\mathit{ed}_{\mathrm{Forb}(C_6)}(p) = \min\{p(1-p), (1-p)/2\}$.

Figure 5. Plot of $\mathit{ed}_{\mathrm{Forb}(C_7)}(p) = \min\{p/2, p(1-p)/(1+p), (1-p)/3\}$.

Figure 6. Plot of $\mathit{ed}_{\mathrm{Forb}(C_8)}(p) = \min\{p(1-p)/(1+p), (1-p)/3\}$.

Figure 7. Plot of $\mathit{ed}_{\mathrm{Forb}(C_9)}(p) = \min\{p/2, (1-p)/4\}$.

Figure 8. Plot of $\mathit{ed}_{\mathrm{Forb}(C_{10})}(p) = \min\{p(1-p)/(1+2p), (1-p)/4\}$. An upper bound for $p < 1/7$ is also on the graph.



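Conjecture 29. Fix $p_0 \in (0, 1)$ and let $\mathcal{H} = \mathrm{Forb}(G(n_0, p_0))$. Then

\[ \mathit{ed}_{\mathcal{H}}(p) = (1 + o(1))\,\frac{2\log_2 n_0}{n_0}\,\min\left\{\frac{p}{-\log_2(1-p_0)},\ \frac{1-p}{-\log_2 p_0}\right\} \]

with probability approaching 1 as $n_0 \to \infty$.

The functions that define this bound are of the form $p/(\chi - 1)$ and $(1-p)/(\chi - 1)$.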



Conjecture 29 was proved for the case $p_0 = 1/2$ by Alon and Stav [5]. If it is true in general, then it implies that $p^* = \frac{\log(1-p_0)}{\log(p_0(1-p_0))}$, which is only equal to $p_0$ itself when $p_0 \in \{0, 1/2, 1\}$. Recall that $\mathit{ed}_{\mathcal{H}}(p) = \lim_{n\to\infty} \mathrm{dist}(G(n,p), \mathcal{H})/\binom{n}{2}$ and it achieves its maximum at $p^*$. Informally, the conjecture implies that it is harder to edit away copies of $G(n_0, p_0)$ from $G(n, p^*)$ than it is from $G(n, p_0)$. This seems to be rather counterintuitive.
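The location of the maximum can be explored numerically. The sketch below assumes that the two branches of the conjectured bound are proportional to $p/(-\log_2(1-p_0))$ and $(1-p)/(-\log_2 p_0)$. Setting them equal gives $p^* = \log(1-p_0)/\log(p_0(1-p_0))$, and the common leading factor plays no role in where the maximum sits. The names `p_star` and `branches` are hypothetical helpers.

```python
import math


def p_star(p0: float) -> float:
    """Crossing point of the two assumed branches, for p0 strictly between 0 and 1."""
    return math.log(1.0 - p0) / math.log(p0 * (1.0 - p0))


def branches(p: float, p0: float) -> tuple:
    """The two assumed branches, without the common leading factor."""
    return p / (-math.log2(1.0 - p0)), (1.0 - p) / (-math.log2(p0))


if __name__ == "__main__":
    for p0 in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(p0, p_star(p0), branches(p_star(p0), p0))
```

For $p_0 = 1/2$ the crossing point is $1/2$, and for $p_0 < 1/2$ it lies below $p_0$, which is the asymmetry the paragraph above calls counterintuitive.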

If Conjecture 29 is false, then it implies that there is more information about
the structure of random graphs than is revealed by just the chromatic numbers.

8.2. Thanks. I would like to thank Maria Axenovich and József Balogh for conversations
which have improved the results. I would like to thank Andrew Thomason
for some useful conversations and for directing me to [13]. I would also like to 
thank Tracy McKay for valuable discussions which deepened my understanding of 
previous results. 

Thank you to Ed Marchant for finding an error in a previous version of this 
manuscript. 

Figures are made by Mathematica and WinFIGQT.